Abstract

A ground-based microwave radiometer system that will be used to measure mesospheric water vapor is being completed at IRL. The present study addresses the basic radiative transfer of this experiment and the interaction between the atmosphere and electromagnetic radiation.

Using a classical mathematical analysis of the data inversion process, an estimate of the true information content of the received data is produced. As anticipated, this process depends critically upon the structure of the weighting functions.

The result of this study is that the present radiometer system should yield four clearly independent pieces of information per profile, with a fifth piece possible, for realistic estimates of system errors.
1. Introduction

Information on the vertical profile of each constituent of the atmosphere can provide a better understanding of the atmosphere. This is especially true for water vapor, which plays a dominant role in the photochemistry of the middle atmosphere (stratosphere and mesosphere). Therefore, investigating the concentration of water vapor in the atmosphere is very important.

There are two major ways to make the measurements: in-situ techniques and remote sensing techniques. The vertical profile of H2O in the lower atmosphere can be determined by balloon sounding. However, in the upper atmosphere, where the H2O content is much smaller than in the troposphere, contamination may make it difficult to determine the correct amount. Therefore, remote sensing, which allows us to study an atmospheric region without disturbing it, is a very attractive solution.

Here we choose the microwave rather than the IR remote sensing technique to investigate the H2O content in the upper atmosphere. The major reason is that the characteristic spectrum of H2O has many more, and more closely spaced, absorption lines in the IR region than in the microwave region. Collisions among air molecules significantly broaden these lines, and the resulting overlap among the absorption bands decreases the vertical resolution. In addition, a microwave radiometer can detect lower power levels and can observe through clouds that are opaque to IR.

According to kinetic theory (Goody, 1964), the lines of the characteristic spectrum of each molecule are broadened both by collisions between molecules and by the Doppler effect. The former, depending on the gas pressure, dominates in the lower atmosphere and decreases exponentially with height, while the latter, depending upon random molecular motions and hence temperature, contributes significantly only at levels above 80 km. The H2O content of the atmosphere is found mostly within the troposphere, with only a very small contribution at higher levels. The spectrum corresponding to this H2O distribution should therefore be smooth and broad at lower levels, narrowing to a small-amplitude but much sharper peak at upper levels. Such differences in the half-width of the spectral peak allow a microwave radiometer set up at ground level to measure the radiation absorbed or emitted by H2O throughout the atmosphere.
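To give a feel for the scale of the Doppler contribution mentioned above, the following minimal sketch evaluates the standard Doppler half width at half maximum, (ν0/c)·sqrt(2 ln2 · kT/m), for the 22.235 GHz line. The 200 K temperature and the function name `doppler_hwhm_hz` are illustrative assumptions, not values taken from this report.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
AMU = 1.66053907e-27     # atomic mass unit, kg
C_LIGHT = 2.99792458e8   # speed of light, m/s


def doppler_hwhm_hz(line_hz, temperature_k, molar_mass_amu=18.015):
    """Doppler half width at half maximum of a spectral line, in Hz.

    molar_mass_amu defaults to the mass of H2O.
    """
    mass = molar_mass_amu * AMU
    thermal_speed = math.sqrt(2.0 * math.log(2.0) * K_B * temperature_k / mass)
    return (line_hz / C_LIGHT) * thermal_speed


# 200 K is an assumed, representative upper-mesosphere temperature.
hwhm = doppler_hwhm_hz(22.235e9, 200.0)
print(f"Doppler HWHM at 200 K: {hwhm / 1e3:.1f} kHz")
```

Under these assumptions the Doppler half width comes out at a few tens of kHz, which is why it only matters once the pressure-broadened width has fallen to a comparable value at high altitude.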

2.  Radiative  Transfer 

Water vapor has only two characteristic lines in the microwave region: 22 GHz and 183 GHz. The 183 GHz line is much more intense, but its attenuation through the troposphere is so strong that it can only be used from platforms aloft, such as a satellite making a limb-viewing measurement. For a ground-based microwave measurement of H2O, one must use the rotational line centered at 22 GHz. The vertical resolution is determined by the spectral line width and the bandwidth of the radiometer, which together set the lower height limit of the H2O content one can determine. (The radiometer in question is constructed with a filter bank centered at 22.235 GHz, covering a half width of 2.5 MHz, using 50 channels; the predicted measurable height range is 50 to 85 km.)

The wavelengths of the microwave region range from 10 cm to 1 mm, which lies in the Rayleigh-Jeans region of the Planck black-body function. This provides a simple relationship between the emission power of the medium and its absolute thermal temperature; that is, the emission power is proportional to the temperature and also to the concentration of the gas at that level. The opacity of the atmosphere to radiation is due to absorption and scattering by air molecules (through their vibrational and rotational motions), and the interaction between air molecules and the radiation field may also emit quantized energy. Since the scattering effects are much smaller than the other two, only absorption and emission are considered here. Assuming local thermal equilibrium (LTE) in the atmosphere, the linear relationship between the measured intensity of the radiation and the atmospheric (thermal) temperature allows one to specify the radiative transfer in the following form, at frequency $\nu$:

$$T_B(\nu) = T_S(\nu)\,e^{-\tau(\nu)} + \int_h^{\infty} T(\nu,s)\,K(\nu,s)\,e^{-\tau(\nu)}\,ds \tag{1}$$

where

$T_B(\nu)$ = the brightness temperature
$T_S(\nu)$ = the thermal temperature of an external source
$T(\nu,s)$ = the kinetic temperature of the atmosphere
$K(\nu,s)$ = the total absorption coefficient
$ds$ = the optical path length

and $\tau(\nu) = \int_h^{\infty} K(\nu,s)\,ds$ = the optical depth (or opacity).

On the right-hand side of eq. (1), the first term is the transmission of the external radiation S and the second term is the emission of the medium.

In order to simplify the nonlinearity between the absorption coefficient and the optical path, some approximations can be made. Consider the atmosphere to be a series of homogeneous, 1 km thick layers, and take the average values of temperature and pressure in each layer to estimate the cross section of the constituent ($\sigma$). The refraction effects in the atmosphere have been calculated and are too small to be included (Longbothum, 1976). In this case, the effective emission of the medium (the term in the radiative transfer equation that involves the integration over the range of interest) can be written as the sum of the contributions from each layer, attenuated through all the layers below. That is:
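$$\int_h^{\infty} T(\nu,s)\,K(\nu,s)\,e^{-\tau(\nu)}\,ds \simeq \sum_{i=1}^{N} T_i(\nu)\,\exp\!\left[-\sum_{j=1}^{i-1}\tau(\nu,s_j)\right]\left[1 - e^{-\tau(\nu,s_i)}\right] \tag{2}$$

where $T_i(\nu)$ is the average temperature in layer i, and

$$\tau(\nu) = \sum_{i=1}^{N}\tau(\nu,s_i) = \sum_{i=1}^{N}\int_{s(\nu,h_i)}^{s(\nu,h_{i+1})} K(\nu,s)\,ds = \sum_{i=1}^{N}\left[K(\nu,s_{i+1}) + K(\nu,s_i)\right]\cdot\frac{\Delta s}{2} \tag{3}$$

where N is the number of layers.

Therefore, the total brightness temperature is:

$$T_B(\nu) = T_S(\nu)\,\exp\!\left[-\sum_{j=1}^{N}\tau(\nu,s_j)\right] + \sum_{i=1}^{N} T_i(\nu)\,\exp\!\left[-\sum_{j=1}^{i-1}\tau(\nu,s_j)\right]\left[1 - e^{-\tau(\nu,s_i)}\right] \tag{4}$$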
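The layered form of the brightness temperature can be sketched numerically as follows. This is a minimal illustration, assuming each homogeneous layer i emits T_i(1 - exp(-tau_i)) and is attenuated by the opacity of all layers below it, with the external source attenuated by the total opacity. The function name `brightness_temperature` and the sample values are hypothetical.

```python
import numpy as np


def brightness_temperature(t_source, t_layers, tau_layers):
    """Brightness temperature seen from the ground through N homogeneous layers.

    t_source   : temperature of the external source (K)
    t_layers   : mean temperature of each layer, bottom to top (K)
    tau_layers : opacity of each layer along the line of sight
    """
    t_layers = np.asarray(t_layers, dtype=float)
    tau_layers = np.asarray(tau_layers, dtype=float)
    # Opacity between the radiometer and the bottom of layer i (sum over j < i).
    tau_below = np.concatenate(([0.0], np.cumsum(tau_layers)[:-1]))
    transmitted = t_source * np.exp(-tau_layers.sum())
    emitted = np.sum(t_layers * (1.0 - np.exp(-tau_layers)) * np.exp(-tau_below))
    return transmitted + emitted


# Sanity check: an isothermal atmosphere in front of a source at the same
# temperature must return that temperature, whatever the layer opacities are.
tb_iso = brightness_temperature(250.0, [250.0] * 5, [0.1, 0.2, 0.05, 0.3, 0.01])
print(tb_iso)
```

The isothermal check is a useful guard when changing the layer bookkeeping, because the attenuated-emission sum telescopes exactly in that case.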


For an absorption experiment, the second term on the right-hand side of eq. (4) is negligible. In order to avoid the uncertainties in the solar temperature at microwave wavelengths, the method of calculating the ratio of the brightness temperatures at two different zenith angles is preferred. Operating the ground-based radiometer at zenith angles ($\phi$) lower than 80° allows one to approximate the spherical earth geometry by a plane earth, i.e. ds can be written as $\sec\phi \cdot dz$; then we get:

$$\sum_{j=1}^{N}\tau(\nu,z_j) = \frac{\ln\left(T_B^{1}/T_B^{2}\right)}{\sec\phi_2 - \sec\phi_1} \tag{5}$$

where we define

$$\tau(\nu,z_j) = \int_{z_j(\nu)}^{z_{j+1}(\nu)} K(\nu,z)\,dz$$

and the superscripts 1 and 2 stand for the cases at the different zenith angles $\phi_1$ and $\phi_2$. The quantity on the right-hand side of eq. (5) can be determined directly through measurements, and is thus defined as $g(\nu)$ or $g_i$, where i indexes the different frequencies.
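The two-angle ratio can be checked with synthetic numbers. The sketch below assumes a pure absorption measurement, T_B = T_S exp(-tau sec(phi)), and recovers the zenith opacity from two brightness temperatures. The function name `zenith_opacity`, the source temperature, and the angles are illustrative assumptions.

```python
import math


def zenith_opacity(tb_1, tb_2, zenith_deg_1, zenith_deg_2):
    """Zenith opacity from brightness temperatures at two zenith angles.

    Uses ln(T_B1 / T_B2) / (sec(phi_2) - sec(phi_1)), plane-parallel geometry.
    """
    sec_1 = 1.0 / math.cos(math.radians(zenith_deg_1))
    sec_2 = 1.0 / math.cos(math.radians(zenith_deg_2))
    return math.log(tb_1 / tb_2) / (sec_2 - sec_1)


# Synthetic absorption measurement (all numbers are made up for the check).
tau_true = 0.05
t_source = 6000.0
tb_a = t_source * math.exp(-tau_true / math.cos(math.radians(30.0)))
tb_b = t_source * math.exp(-tau_true / math.cos(math.radians(60.0)))

tau_est = zenith_opacity(tb_a, tb_b, 30.0, 60.0)
print(tau_est)
```

Note that the source temperature cancels in the ratio, which is the reason the report prefers this method.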
Generally, the absorption coefficient $K(\nu,s)$ for optical path s at frequency $\nu$ can be written as the product of the volumetric concentration of the constituent n(s) and the total extinction cross section $\sigma(\nu,s)$.

The weak dependence of $K(\nu,s)$ on n(s) has been tested over a range which corresponds to volumetric mixing ratios of 1 to 18 ppm. $K(\nu,s)$ varies only from 0.992 to 0.995 km$^{-1}$ over this range, so to a first approximation it is separable. Thus we apply the mean value for each layer and the optical path becomes


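$$\tau(\nu,z_j) = \int_{z_j(\nu)}^{z_{j+1}(\nu)} K(\nu,z)\,dz = \left[n(z_j)\cdot\frac{\sigma(\nu,z_j) + \sigma(\nu,z_{j+1})}{2}\right]\cdot\Delta z_j = WF_i(z_j)\cdot n(z_j) \tag{6}$$

where $WF_i(z_j)$ is defined as $\frac{\Delta z}{2}\cdot\left[\sigma(\nu,z_j) + \sigma(\nu,z_{j+1})\right]$ and is called the ith weighting function. Therefore the relation becomes:

$$g_i = \sum_{j=1}^{N} WF_i(z_j)\cdot n(z_j) \tag{7}$$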
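The weighting-function relation is a matrix-vector product, which the following sketch makes explicit. It assumes the cross sections are tabulated at layer boundaries for each channel and uses the layer-mean (trapezoidal) value; the array values and the name `weighting_functions` are made up for illustration.

```python
import numpy as np


def weighting_functions(sigma, dz):
    """Absorption-case weighting functions.

    sigma : array of shape (n_channels, n_levels), cross section at each
            layer boundary for each channel
    dz    : layer thickness
    Returns an array of shape (n_channels, n_levels - 1) holding
    dz/2 * [sigma(z_j) + sigma(z_{j+1})].
    """
    sigma = np.asarray(sigma, dtype=float)
    return 0.5 * dz * (sigma[:, :-1] + sigma[:, 1:])


# Two channels, three levels (two layers); numbers are arbitrary.
sigma = np.array([[1.0, 2.0, 3.0],
                  [2.0, 2.0, 2.0]])
wf = weighting_functions(sigma, dz=1.0)

n = np.array([4.0, 6.0])   # concentration in each layer
g = wf @ n                 # g_i = sum_j WF_i(z_j) * n(z_j)
print(wf)
print(g)
```

Writing the forward model this way makes the later inversion problem a question about the rows of `wf`, which play the role of the kernels.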


Consider the emission experiment; the second term on the right-hand side of eq. (1) must then dominate. This term can be approximated as:

$$T_B^{*}(\nu) = \sum_{i=M+1}^{N} T_i\left(1 - e^{-\tau(\nu,s_i)}\right)\cdot\exp\!\left\{-\left[\sum_{j=1}^{M}\tau(\nu,s_j) + \sum_{j=M+1}^{i-1}\tau(\nu,s_j)\right]\right\} \tag{8}$$

where $T_B^{*}(\nu)$ is the upper atmospheric contribution to the brightness temperature, in which we are interested; it equals the total brightness temperature, $T_B(\nu)$, minus the lower atmosphere contribution, which can be thought of as a baseline (and is assumed to encompass layers 1 to M). Since $\tau(\nu,s_i) \ll 1$ and $\sum_{j=1}^{M}\tau(\nu,s_j) \gg \sum_{j=M+1}^{i-1}\tau(\nu,s_j)$, the equation becomes:
$$T_B^{*}(\nu) = \sum_{i=M+1}^{N} T_i\,\exp\!\left[-\sum_{j=1}^{M}\tau(\nu,s_j)\right]\cdot\frac{\Delta s}{2}\cdot\left[\sigma(\nu,s_i) + \sigma(\nu,s_{i+1})\right]\cdot n(s_i) = \sum_{i=M+1}^{N} WF(\nu,s_i)\cdot n(s_i) \tag{9}$$

Here $WF(\nu,s_i)$ for the emission case corresponds to

$$T_i\,\exp\!\left[-\sum_{j=1}^{M}\tau(\nu,s_j)\right]\cdot\frac{\Delta s_i}{2}\cdot\left[\sigma(\nu,s_i) + \sigma(\nu,s_{i+1})\right]$$

It can also be written in a more general form as:

$$g_i = \sum_{j=M+1}^{N} WF_i(s_j)\cdot n(s_j) \tag{10}$$

where i stands for the frequency dependence, and $s_j = z_j\cdot\sec\phi$. With an appropriate inversion method, the water vapor content ($n(z_j)$) in either case can be determined.


3.  Information  Content 


Generally, an indirect remote sensing measurement has the following form:

$$g_i = \int_a^b k_i(x)\,f(x)\,dx \tag{11}$$

It relates the measurement data $g_i$ to the inaccessible profile f(x) through the proper weighting function $k_i(x)$ (or kernel, in mathematical terms) distributed over the region [a,b] in which we are interested. The different i usually represent the different frequencies at which the measurements have been made, and $k_i(x)$ can be some kind of optical transmission function.

However, in most indirect sensing techniques the atmospheric measurements show a certain degree of correlation, which raises the question of the benefit of taking more data points. For example, if the value of one measurement can be expressed in terms of the other measurements, that measurement is said to be predictable. If the predicted value lies within an uncertainty envelope smaller than the experimental noise level, the value can be predicted better than it can be measured (within the experimental accuracy). In this case, it would be redundant to continue the measurement. Therefore, it is worthwhile to investigate the actual "information content" of such a measurement.

As shown by the relationship in eq. (11), the dependence among measurements usually comes from the physical properties of the kernels, which may not all be linearly independent for all f(x). In this case, investigating the degree of independence among the kernels corresponds to finding the independence of the measurements, and thus determines the extent of the information contained. There are two advantages to examining the independence of the kernels. First, in view of the cost of adding and analyzing more data points, the independence of the kernels (and thus the usefulness of the added measurements) can be determined before the measurements are taken. The same process which provides the information content can also help locate the channels from which the information comes, thus avoiding redundant measurements. Secondly, in view of resolution, the closer the relationships among the kernels, the more difficult the inversion for the profile f(x) will be. To make this point clear, note that the integral in eq. (11) can be approximated by a numerical quadrature sum and thus written in matrix form as G = KF, for a finite set of data points and finite measurement intervals. Highly dependent kernels make the determinant of the kernel matrix very small. Therefore, when one inverts the matrix K in order to find the unknown profile F, as in $K^{-1}G = F$, the error (including the truncation error of the computer) is magnified so much that the information can no longer be obtained. Under such circumstances, special constrained techniques may be considered.
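The error magnification caused by nearly dependent kernels is easy to reproduce with a two-by-two toy problem. The matrix, profile, and perturbation below are invented purely for illustration; they are not taken from the radiometer.

```python
import numpy as np

# Two almost identical "kernels": the rows of K differ by one part in a thousand.
K = np.array([[1.0, 1.0],
              [1.0, 1.001]])
f_true = np.array([1.0, 2.0])

g = K @ f_true                          # noise-free measurements
g_noisy = g + np.array([0.0, 0.001])    # a 0.001 error in one measurement

f_est = np.linalg.solve(K, g_noisy)     # direct inversion, F = K^-1 G

print("determinant:", np.linalg.det(K))
print("true profile:     ", f_true)
print("recovered profile:", f_est)
```

With this choice the determinant is about 0.001, and a measurement error of 0.001 shifts each recovered component by about 1, so the profile comes back as roughly (0, 3) instead of (1, 2).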
The theory of the information content of an experiment is based solely on the presence of noise in the experiment and the nature of the kernels. In principle, one is looking for a set of a's, not all zero simultaneously, such that $\sum_{i=1}^{N} a_i k_i(x) = 0$. Then one of the kernels can be written as a linear combination of the others, and thus it is predictable. However, the above summation may not vanish in the general case because some uncertainty exists, both experimentally and in the numerical approximations adopted. Hence we search instead for the set of a's that minimizes $\sum_{i=1}^{N} a_i k_i(x)$ subject to a chosen constraint, say $\sum_{i=1}^{N} a_i^2 = 1$. (The absolute magnitudes of the a's are irrelevant.) As long as the summation is less than or equal to the noise level for all x in [a,b], one kernel can be predicted within the experimental accuracy. Thus the information provided by this kernel will be lost in the noise, and the number of independent pieces of information must be reduced by one. An appropriate method to minimize such a quantity (which is a function of x) is to look for the minimum value of its quadratic form $q = \int_a^b \left|\sum_{i=1}^{N} a_i k_i(x)\right|^2 dx$, which can be written in vector notation as:

$$q = \int_a^b \mathbf{a}^{*}\,\mathbf{k}(x)\,\mathbf{k}^{*}(x)\,\mathbf{a}\,dx = \mathbf{a}^{*}\left[\int_a^b \mathbf{k}(x)\,\mathbf{k}^{*}(x)\,dx\right]\mathbf{a} = \mathbf{a}^{*}\,C\,\mathbf{a} \tag{12}$$

where $\mathbf{a}$ is not a function of x, and $\mathbf{a}$, $\mathbf{k}(x)$ are column vectors. C is the covariance matrix of $\mathbf{k}$, $C = \left[\int_a^b \mathbf{k}(x)\,\mathbf{k}^{*}(x)\,dx\right]$. Applying the eigenvalue theorem (Courant and Hilbert, 1953) subject to the constraint that $\sum_{i=1}^{N} a_i^2 = 1$, the extremum values of q are given by $\lambda_i$, the eigenvalues of the covariance matrix C, if $\mathbf{a}$ is chosen to be the corresponding normalized eigenvector $U_i$. Thus $C = U \Lambda U^{*}$ and $q = \mathbf{a}^{*} C\,\mathbf{a} = \mathbf{a}^{*} U \Lambda U^{*} \mathbf{a}$, where U and $\Lambda$ are the eigenvector and eigenvalue matrices, respectively. Therefore, the smallest eigenvalue provides the minimum value of q, and the magnitude of $\sum_{i=1}^{N} a_i k_i(x)$ for all x in [a,b] is $\sqrt{\lambda}$. (Since the covariance matrix C is a positive definite symmetric matrix, the corresponding eigenvalues are all positive and nonzero.) If one of the eigenvalues, $\lambda$, is smaller than the estimated measurement plus computational error, the number of independent kernels, or information content, should be decreased by one. If p of the eigenvalues are smaller than the noise level, the number should be decreased by p.
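The eigenvalue test described above can be sketched in a few lines. The example builds the covariance matrix of a set of synthetic, strongly overlapping Gaussian kernels on [0, 1] and counts the eigenvalues that exceed a chosen noise threshold. The kernel shapes, the unit-norm scaling, the threshold N times the squared r.m.s. error, and the name `independent_pieces` are all assumptions made for illustration; the radiometer's real kernels are not reproduced here.

```python
import numpy as np


def independent_pieces(kernels, dx, noise_sq):
    """Count eigenvalues of C = integral of k k* dx that exceed noise_sq.

    kernels : array (n_kernels, n_points), kernels sampled on a uniform grid
    dx      : grid spacing (rectangle-rule quadrature weight)
    """
    kernels = np.asarray(kernels, dtype=float)
    cov = kernels @ kernels.T * dx                # C_ij ~ integral k_i k_j dx
    eigvals = np.linalg.eigvalsh(cov)[::-1]       # sorted largest first
    return int(np.sum(eigvals > noise_sq)), eigvals


x = np.linspace(0.0, 1.0, 201)
dx = x[1] - x[0]
centers = np.linspace(0.3, 0.7, 8)                # eight overlapping kernels
kernels = np.exp(-0.5 * ((x[None, :] - centers[:, None]) / 0.25) ** 2)
# Scale every kernel to unit norm so that each C_ii equals 1.
kernels /= np.sqrt(np.sum(kernels ** 2, axis=1, keepdims=True) * dx)

count, eigvals = independent_pieces(kernels, dx, noise_sq=8 * 0.01 ** 2)
print("eigenvalues:", eigvals)
print("independent pieces at 1% error:", count)
```

Because the kernels overlap so heavily, the eigenvalues fall off by orders of magnitude and only a few of the eight survive the threshold, which mirrors the qualitative behavior reported for the radiometer channels.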

The above statement can be shown clearly through the effect of $k_i(x)$ on $g_i$. Now consider the error $\varepsilon_i$ contained in the measurement $g_i$ as given in eq. (11):

$$g_i + \varepsilon_i = \int_a^b k_i(x)\,f(x)\,dx \tag{13}$$

Assuming $g_\ell$ is predictable, the corresponding prediction of $g_\ell$ can be written as a linear combination of the other measurement values $g_i$, as in $g_\ell(\text{pred.}) = -\frac{1}{a_\ell}\sum_{i=1,\,i\neq\ell}^{N} a_i g_i$. The measured value $g_\ell$ can be calculated by multiplying eq. (13) through by $a_i$, summing over all i, and rearranging the terms; we get:

$$g_\ell + \frac{1}{a_\ell}\sum_{i=1}^{N} a_i\varepsilon_i = -\frac{1}{a_\ell}\sum_{i=1,\,i\neq\ell}^{N} a_i g_i + \frac{1}{a_\ell}\int_a^b\left[\sum_{i=1}^{N} a_i k_i(x)\right] f(x)\,dx \tag{14}$$

The first term on the right-hand side of eq. (14) is $g_\ell(\text{pred.})$ (an estimate of $g_\ell$). In this case, one can estimate $g_\ell$ more closely than one can measure it if the second term on the right-hand side of eq. (14) is less than or equal to the error term (the second term) on the left-hand side of the equation, i.e. if

$$\left|\int_a^b\left[\sum_{i=1}^{N} a_i k_i(x)\right] f(x)\,dx\right| \le \left|\sum_{i=1}^{N} a_i\varepsilon_i\right| \tag{15}$$

Applying Schwarz's inequality and the mean value theorem to the left-hand side of the inequality, eq. (15), we find that:

$$\left|\int_a^b \sum_{i=1}^{N} a_i k_i(x)\,f(x)\,dx\right|^2 \le \left|\int_a^b \sum_{i=1}^{N} a_i k_i(x)\,dx\right|^2\cdot\left|f_m(x)\right|^2$$

The minimized quantity, $\left|\int_a^b \sum_{i=1}^{N} a_i k_i(x)\,dx\right|^2$, is the smallest eigenvalue $\lambda_m$ of the covariance matrix C multiplied by a constant which is determined by the integral limits. Thus, if $|f_m(x)|^2$ has order of magnitude one, and with a properly adjusted integral scale [0,1], the upper bound of this quantity would be $\lambda_m$.

Again, applying Schwarz's inequality to the right-hand side of eq. (15) above:

$$\left|\sum_{i} a_i\varepsilon_i\right|^2 \le \left|\sum_{i} a_i^2\right|\cdot\left|\sum_{i}\varepsilon_i^2\right|$$

For an independent, randomly distributed error $\varepsilon_i$, $\left|\sum_{i=1}^{N}\varepsilon_i^2\right| = N\left|\varepsilon_{rms}\right|^2$. Generally speaking, for a relative error, $\left|\sum_i \varepsilon_i^2\right| = |\boldsymbol{\varepsilon}|^2$. Hence, it is clear that if $\lambda_m$ is "less than" $|\boldsymbol{\varepsilon}|^2$ (or, better, "much less than", for there is considerable uncertainty when they are of the same order of magnitude), the noise to signal ratio is large enough that information cannot be obtained.

Now it is more interesting to know exactly which of the kernels are predictable. The best approximation can be made by choosing the kernel with the weakest response. That is, for a given very small eigenvalue $\lambda_m$, the corresponding linear combination of kernels is approximately zero. The kernel $k_i$ whose coefficient has the largest value can therefore be thought of as the most weakly represented base function, and this normalized kernel should be the least useful.
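A small sketch of this selection rule follows, assuming "largest value" is read as the largest coefficient magnitude in the eigenvector of the smallest eigenvalue (the sign of an eigenvector is arbitrary). The three-kernel covariance matrix is an invented example in which the third kernel is exactly 0.6 times the first plus 0.8 times the second; `least_useful_kernel` is a hypothetical helper name.

```python
import numpy as np


def least_useful_kernel(cov):
    """Index of the kernel with the largest |coefficient| in the eigenvector
    belonging to the smallest eigenvalue of the covariance matrix."""
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    a = eigvecs[:, 0]                        # coefficients a_i for the smallest one
    return int(np.argmax(np.abs(a))), eigvals[0]


# Covariance of three unit-norm kernels with k3 = 0.6*k1 + 0.8*k2 and k1
# orthogonal to k2 (a deliberately exact dependence for the demonstration).
cov = np.array([[1.0, 0.0, 0.6],
                [0.0, 1.0, 0.8],
                [0.6, 0.8, 1.0]])

index, smallest = least_useful_kernel(cov)
print("least useful kernel (0-based index):", index)
print("smallest eigenvalue:", smallest)
```

Here the smallest eigenvalue is zero to rounding error and the third kernel (index 2) carries the largest coefficient, so it is the one flagged as predictable from the others.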

To make the case simple, and to have a direct measure for $\lambda$, proper scaling of g, k, and f is necessary; it does not change the relationship between them. Moreover, it provides a convenient way to estimate the relative error. As pointed out earlier, $g_i$ can be scaled to be of order one; then the $\varepsilon_i$ are the relative errors and $|\mathbf{g}|^2 \approx N$ (the number of channels). Scaling can also be done such that $|f| \approx 1$ and the kernels are normalized. Then eq. (13) can be written as:

$$\frac{g_i}{\alpha} + \frac{\varepsilon_i}{\alpha} = \int_a^b k_i(x)\cdot\left[\frac{1}{\beta}\,f(x)\right]dx \tag{16}$$

where $\alpha$, $\beta$ are proper scaling factors.

Also, the integral limits can be rescaled to run from 0 to 1. Without proper scaling there can be confusion in the comparison between the eigenvalues and the noise levels, as has been pointed out by Twomey (1974) in his earlier papers.

Such eigenvalue techniques can also be used to analyze the unknown function f(x) directly, by introducing a new set of orthonormal functions $\phi_j(x)$ on which f(x) can be projected; then $f(x) = \sum_j \xi_j\,\phi_j(x)$. However, such a new orthonormal set of $\phi_j(x)$ must satisfy eq. (11) and should be constrained to $k_i(x)$. Therefore, in general, $\phi_j(x)$ is chosen as a linear combination of all the kernels, and the normalization constraint determines the corresponding coefficients. It turns out that the best choice of $\phi_j(x)$ is

$$\phi_j(x) = \frac{1}{\sqrt{\lambda_j}}\sum_{i=1}^{N} u_{ij}\,k_i(x)$$

where $u_{ij}$ is the ith element of the eigenvector associated with eigenvalue $\lambda_j$ of the covariance matrix C. Written in matrix form, it becomes $\boldsymbol{\phi}(x) = \Lambda^{-1/2}\,U^{*}\,\mathbf{k}(x)$. With this substitution in eq. (11), one obtains:

$$\boldsymbol{\xi} = \Lambda^{-1/2}\,U^{*}\,\mathbf{g} \tag{17}$$

and f(x) becomes:

$$f(x) = \mathbf{k}^{*}(x)\,U\,\Lambda^{-1}\,U^{*}\,\mathbf{g} \tag{18}$$

Obviously, small eigenvalues involved in such an inversion for f(x) will make it very unstable. But it provides a straightforward way to investigate the error magnification, which is also a criterion for determining the information content by deleting the measurements which have an excessive error magnification. (Detailed descriptions can be found in Twomey (1977).)
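The eigenvector expansion of the profile can be sketched on a discrete grid as follows. The example assumes the form f(x) = k*(x) U Λ^-1 U* g with C approximated by a rectangle-rule quadrature, and offers an optional cutoff that drops eigenvalues below a chosen level. The three polynomial kernels, the test profile, and the name `eigen_inversion` are illustrative assumptions only.

```python
import numpy as np


def eigen_inversion(kernels, g, dx, min_eigenvalue=0.0):
    """Estimate f on the grid from measurements g via the eigen-expansion.

    Eigenvalues not exceeding min_eigenvalue are discarded, which limits
    the error magnification at the cost of resolution.
    """
    kernels = np.asarray(kernels, dtype=float)
    cov = kernels @ kernels.T * dx                 # C ~ integral k k* dx
    lam, u = np.linalg.eigh(cov)
    keep = lam > min_eigenvalue
    coeffs = (u[:, keep].T @ g) / lam[keep]        # Lambda^-1 U* g
    return kernels.T @ (u[:, keep] @ coeffs)       # k*(x) U Lambda^-1 U* g


x = np.linspace(0.0, 1.0, 101)
dx = x[1] - x[0]
kernels = np.vstack([np.ones_like(x), x, x ** 2])  # made-up kernels

# A profile lying in the span of the kernels is recovered exactly
# (up to rounding) when no eigenvalue is discarded.
f_true = 2.0 - x + 0.5 * x ** 2
g = kernels @ f_true * dx                          # noise-free measurements
f_est = eigen_inversion(kernels, g, dx)
print(np.max(np.abs(f_est - f_true)))
```

The factor 1/λ in `coeffs` is where measurement noise is amplified, so raising `min_eigenvalue` trades recoverable detail for stability.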

Although we have concentrated here on analyzing the information content of the kernels, the same method can be applied directly to the measurement data, as has been done in many cases (e.g. Mateer, 1965). If there is one eigenvalue which is less than $|\boldsymbol{\varepsilon}|^2$, one measurement can be predicted better than measured and the information content should be reduced by one. If there are p eigenvalues which are less than $|\boldsymbol{\varepsilon}|^2$, there are p redundant measurements. For a total of N measurements, the number of pieces of independent information becomes (N-p).

4. Results and Discussion

The previous analysis has been applied to investigate the information content of the spectral output of a microwave radiometer used to detect the H2O content in the stratosphere and mesosphere. This spectral output totals 49 channels, separated by 50 kHz, ranging from 22.2362798 GHz to 22.2338798 GHz and centered at 22.2350798 GHz. Since the channels are symmetric about the center frequency, the information content analysis need only be done for 25 channels. The altitude range was chosen to be 50 km to 86 km, which represents thirty-six 1 km layers.
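The channel layout quoted above can be reproduced directly. The short sketch below generates the 49 channel frequencies from the stated center frequency and 50 kHz spacing, then keeps the 25 channels from the low-frequency edge through the center (numbered k1 to k25 in Table II). Variable names are arbitrary.

```python
import numpy as np

CENTER_GHZ = 22.2350798
SPACING_GHZ = 50e-6                      # 50 kHz expressed in GHz

offsets = np.arange(-24, 25)             # 49 channels, symmetric about the center
freqs_ghz = CENTER_GHZ + offsets * SPACING_GHZ

# By symmetry only one side plus the center channel needs to be analyzed.
unique_ghz = freqs_ghz[:25]              # k1 (band edge) ... k25 (line center)
offset_mhz = (CENTER_GHZ - unique_ghz) * 1e3

print(len(freqs_ghz), freqs_ghz[0], freqs_ghz[-1])
print(unique_ghz[0], offset_mhz[0])
```

The band edges come out at 22.2338798 and 22.2362798 GHz, so the analyzed channels extend 1.20 MHz from the line center.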

Let us first consider the emission experiment. The 25 kernels, which have been normalized to unit area for convenience (Fig. 1), were used for the eigenvalue and eigenvector analysis of their covariance matrix C.

The resulting eigenvalues, as listed in Table I (only for those whose magnitudes are greater than $10^{-9}$), show that for a given 1% relative r.m.s. error, four independent pieces of information can safely be drawn, and a fifth may be possible as well. The rapidly decreasing magnitude of the eigenvalues indicates that improving the experimental accuracy provides little, if any, new information. In this case, many of the 25 measurements will be redundant.

The kernels which contribute the most information are $k_1$, $k_{13}$, $k_{22}$, $k_{25}$, and to a lesser extent $k_{24}$, where the subscripts represent the corresponding frequencies: 22.2338798, 22.2344798, 22.2349298, 22.2350798 and 22.2350298 GHz, respectively (Table II). Their dependence on height has been plotted in Fig. 2. Since the original kernel functions are reasonably smooth and very much overlapped (as shown in Fig. 1), it is not surprising that they have such limited independence. This means the measurement can be done as well with these four (or five) channels as with all the original 25 channels, to within an experimental uncertainty of about 1%. To make this point more explicit, the same eigenvalue analysis was applied to an arbitrary set of ten channels; specifically, channel numbers 2, 5, 8, 11, 14, 17, 20, 23, 24 and 25 were chosen. The results are also listed in Table I. For the same noise level, four pieces of information are apparently derivable. Under such circumstances, it is obvious that making more measurements does not improve, by much, knowledge concerning the inaccessible profile f.
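The counts quoted in this section can be checked against the eigenvalues of Table I. The sketch below assumes one particular reading of the criterion from Section 3, namely that an eigenvalue counts as usable when it exceeds N times the squared relative r.m.s. error. That threshold is an interpretation, not a rule stated explicitly for the table, and `count_independent` is a hypothetical helper.

```python
import numpy as np

# Eigenvalues copied from Table I, largest first. The tenth eigenvalue of
# the 10-channel set is not listed in the table and is omitted here.
EIGENVALUES_25 = np.array([8.40e-1, 1.33e-1, 3.66e-2, 5.69e-3, 7.93e-4,
                           1.11e-4, 1.37e-5, 1.34e-6, 1.06e-7, 6.78e-9])
EIGENVALUES_10 = np.array([2.99e-1, 9.81e-2, 1.73e-2, 3.17e-3, 3.47e-4,
                           5.64e-5, 5.34e-6, 1.48e-7, 2.02e-9])


def count_independent(eigenvalues, n_measurements, relative_rms_error):
    """Number of eigenvalues above the assumed threshold N * eps_rms**2."""
    threshold = n_measurements * relative_rms_error ** 2
    return int(np.sum(eigenvalues > threshold))


n25_1pct = count_independent(EIGENVALUES_25, 25, 0.01)
n10_1pct = count_independent(EIGENVALUES_10, 10, 0.01)
n25_5pct = count_independent(EIGENVALUES_25, 25, 0.05)
print(n25_1pct, n10_1pct, n25_5pct)
```

Under this assumed threshold the script gives four pieces of information for both channel sets at 1% error and two for the 25-channel set at 5% error, in line with the numbers quoted in the text. The fifth eigenvalue in each column lies within an order of magnitude of the threshold, which fits the description of a fifth piece as only possible.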
In Figure 3 we show the number of independent pieces of information which can be derived for various experimental error levels. If the error exceeds 5%, only two pieces of information can be inferred from the 25 measurements. Therefore, even though the discussion of error levels cannot be precise, the number of independent pieces of information is still quite apparent as long as the signal to noise ratio is much greater than 1. Increasing the number of measurements or improving the measurement accuracy may therefore not increase the information content considerably.

5. Zenith Angle Effects

The discussion above was based upon calculations for zero zenith angle operation. In order to increase the signal to noise ratio, the radiometer should be operated at lower elevation angles (i.e. to obtain a longer slant optical path). Therefore, the same analysis has also been applied to the same 25 channels, but with a 70° zenith angle.
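In the plane-parallel approximation used earlier (ds = sec φ · dz), the gain in path length at a given zenith angle is simply sec φ. A minimal sketch, with `slant_factor` as an illustrative helper name:

```python
import math


def slant_factor(zenith_deg):
    """Ratio of slant path to vertical path for a plane-parallel atmosphere."""
    return 1.0 / math.cos(math.radians(zenith_deg))


factor_70 = slant_factor(70.0)
print(f"sec(70 deg) = {factor_70:.3f}")
```

At a 70° zenith angle the slant path is a little under three times the vertical path, so a weak emission signal grows by roughly the same factor in the optically thin limit.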

The resulting eigenvalues were extremely close to those of the first case; thus, for the same relative error, the same number of independent pieces of information should be obtained. However, the channels which contribute the most information do tend to be redistributed slightly toward the center frequency channel. Since the discussion of information content is based on the competition between the relative error level and the eigenvalues of the measurement kernels, one can only expect that lowering the elevation angle will reduce the noise to signal ratio and may provide one or more additional pieces of information.

The same analysis at a 70° zenith angle has been carried out for the absorption case, and the previous argument still holds.
6. Summary and Conclusion

The  information  content  of  a  microwave  radiometer  experiment  has 
been  investigated.  Since  the  kernel  (or  weighting)  functions  are 
reasonably  smooth  and  very  much  overlapped,  the  number  of  independent 
pieces  of  information  is  much  less  than  the  total  number  of  possible 
measurement  channels.  This  means  that  if  one  tries  to  use  all  the  channels 
in  performing  the  observation,  many  of  the  measurements  would  be  redundant. 
As for how many independent pieces of information can be drawn, this depends on the relative error of the whole experiment, which can be reduced by lowering the operating elevation angle or improving the instrument itself. One may argue that taking more data points certainly has some value, but such improvements may not be significant enough to provide additional information; in addition, the cost of taking and processing more measurements may be too high. Also, the difficulties of the actual inversion process are magnified by highly dependent kernels; thus it is worthwhile to examine the information content and use the results as a guide.

For the current Penn State system, four, or perhaps five, independent pieces of information are attainable while maintaining reasonable system accuracy.


Table I

(Eigenvalues λ and Corresponding Channel/Kernel Numbers k)

Order of λ    for 10 measurements        for 25 measurements
              λ               k          λ               k
 1            2.99 x 10^-1    (k25)      8.40 x 10^-1    (k25)
 2            9.81 x 10^-2    (k2)       1.33 x 10^-1    (k1)
 3            1.73 x 10^-2    (k23)      3.66 x 10^-2    (k22)
 4            3.17 x 10^-3    (k17)      5.69 x 10^-3    (k13)
 5            3.47 x 10^-4    (k24)      7.93 x 10^-4    (k24)
 6            5.64 x 10^-5    (k20)      1.11 x 10^-4    (k15)
 7            5.34 x 10^-6    (k5)       1.37 x 10^-5    (k23)
 8            1.48 x 10^-7    (k14)      1.34 x 10^-6    (k3)
 9            2.02 x 10^-9    (k11)      1.06 x 10^-7    (k21)
10            *                          6.78 x 10^-9    (k10)

*Eigenvalues not included in this Table have magnitude much less than 10^-9.
Table II (Channel/Kernel Frequency)

Channel/Kernel    Frequency      Frequency Offset
Number            (GHz)          (MHz)
k1                22.2338798     1.20
k2                22.2339298     1.15
k3                22.2339798     1.10
k4                22.2340298     1.05
k5                22.2340798     1.00
k6                22.2341298     0.95
k7                22.2341798     0.90
k8                22.2342298     0.85
k9                22.2342798     0.80
k10               22.2343298     0.75
k11               22.2343798     0.70
k12               22.2344298     0.65
k13               22.2344798     0.60
k14               22.2345298     0.55
k15               22.2345798     0.50
k16               22.2346298     0.45
k17               22.2346798     0.40
k18               22.2347298     0.35
k19               22.2347798     0.30
k20               22.2348298     0.25
k21               22.2348798     0.20
k22               22.2349298     0.15
k23               22.2349798     0.10
k24               22.2350298     0.05
k25               22.2350798     0.00
Fig. 1: The 25 normalized kernels plotted against height, zero zenith angle, emission case.

Fig. 2: The four independent (information-containing) kernels plotted versus height (solid curves); a possible fifth information-containing kernel (unconnected symbols).

Fig. 3: The estimated number of independent kernels, dependent upon the number of measurements, for different possible error levels. (Axes: number of independent kernels versus number of measurements.)